IIT at TREC-10
Abstract
For TREC-10, we participated in the adhoc and manual web tracks and in both the site-finding and cross-lingual tracks. For the adhoc track, we performed extensive calibrations and learned that combining similarity measures yields little improvement. This year, we therefore focused on a single high-performance similarity measure. For site finding, we implemented several algorithms that did well on the data provided for calibration but poorly on the real dataset. For the cross-lingual track, we calibrated on the monolingual collection and developed new Arabic stemming algorithms as well as a novel dictionary-based means of cross-lingual retrieval. Our results in this track were quite promising, with seventeen of our queries performing at or above the median.
Similar resources
Experiments with Query Expansion at TREC 2006 Legal Track
This paper describes the UMKC TREC 2006 Legal Track experiments. We focus on a single technique that uses a co-occurrence-based thesaurus to expand queries. Our results indicate that this technique is effective even given the enormous vocabulary size of the IIT CDIP collection.
IIT Kharagpur at TREC 2008 Blog Track
This paper describes our opinion retrieval system for the TREC 2008 blog track. We focused on five different aspects of the system. The first module focuses on extracting the blog content from junk HTML, thereby decreasing the noise in the indexed content. The second module aims at removing various kinds of spam content from real blogs. The third module aimed at retrieving the relevant do...
IIT at TREC 2004 Standard Retrieval Models Over Partitioned Indices for the Terabyte Track
For TREC-2004, we participated in the Terabyte track. We focused on partitioning the data in the GOV2 collection across a homogeneous cluster of machines and indexing and querying the collection in a distributed fashion using different standard retrieval models on a single system, such as the Robertson BM25 probabilistic measure and a vector space measure. Our partitioned indices were each inde...
IIT TREC 2005: Genomics Track
For the TREC-2005 Genomics Track ad-hoc retrieval task, we report on the development of a scalable information retrieval engine based on a relational data model for the integration of structured data and text. Our objectives are to meet the need for the integrated search of heterogeneous data sets of biomedical literature and structured data found in biological databases, and to demonstrate the...
IIT TREC-9 - Entity Based Feedback with Fusion
For TREC-9, we focused on effectiveness in the web track. The key techniques we employed were information fusion, entity-based relevance feedback, WordNet-based query parsing, and a user interface designed to assist with web-based manual queries. Our initial results are positive. For the manual task, forty of fifty queries are over the median. In the adhoc, title-only task, thirty-four of fifty ...